Abstract

The paper presents a prototype lexicalist Machine Translation system
(based on the so-called 'Shake-and-Bake' approach of Whitelock (1992))
consisting of an analysis component, a dynamic bilingual lexicon, and a
generation component, and shows how it is applied to a range of MT
problems. Multi-Lexeme translations are handled through bi-lexical rules,
which map bilingual lexical signs into new bilingual lexical signs. It is
argued that much translation can be handled by equating translationally
equivalent lists of lexical signs, either directly in the bilingual lexicon, or
by deriving them through bi-lexical rules. Lexical semantic information
organized as Qualia structures (Pustejovsky 1991) is used as a mechanism
for restricting the domain of the rules.



1 Introduction



Transfer based approaches to machine translation (MT) involve three 
main phases: analysis, transfer and generation. During analysis, 
the syntactic and semantic structure of a sentence is made explicit 
through a source language (SL) grammar and semantic processing 
modules. The result of analysis is one or more syntactic and semantic 
representations which are used to construct a syntactic and/or se- 
mantic representation in the target language (TL) through a series 
of transfer rules and a bilingual lexicon. From this representation a 
TL sentence is generated based on some form of mapping procedure, 
usually exploiting the TL grammar (see footnote 1).



1 While this definition of transfer systems is current in most MT discussions, it has been
challenged (Kay et al. 1994) on the basis that the interlingua-transfer distinction, that is, the
distinction between systems which construct language independent representations and systems
which do not, is artificial and that in fact the two paradigms simply represent different aspects
of the same problem. While we agree with this observation, many systems at present start with

In this paper we describe a prototype implementation of a transfer
MT system based on the lexicalist MT (LMT) approach of Whitelock
(1992), also known as 'Shake-and-Bake' (SB). For our implementation
we have extended the original SB formulation by postulating
bilingual lexical rules (bi-lexical rules henceforth) which dynamically 
expand the bilingual lexicon in order to extend its functionality. This 
allows us to uniformly treat mono- and multi-lexeme translations in 
a variety of contexts. 

We describe the main characteristics of the LMT approach. This 
is followed by a description of the problems posed by certain multi- 
lexeme translations, and of how bi-lexical rules, in conjunction with
lexical semantic information, provide a framework for overcoming these
problems. We then point out some limitations in our approach and 
give some idea as to the status of our implementation. 



2 Lexicalist Machine Translation 

In its original formulation, LMT consists of three main phases: analysis,
lexical-semantic transfer and generation. The analysis phase
involves parsing the input sentence to produce an output bag or
multiset of SL lexical signs instantiated with sufficient information
to permit appropriate translation. Transfer maps these signs into
a TL bag through the bilingual lexicon in which sets of source and
target lexical signs are placed in translation correspondence. Generation
consists of finding an ordering of the TL bag which satisfies
the constraints imposed by the TL grammar. Normally, generation
involves a modified parser which ignores ordering information
(Brew 1992; Popowich 1995) although other approaches are also
possible (Poznanski et al. 1995).



2.1 Notation 

We introduce some notation through a simple example of our implementation.
Since we will not be concerned with quantification or
scoping, we adopt a simplified transfer representation. If quantification
and scope were to be included, however, a mechanism along
the lines of Frank and Reyle (1995) and Copestake et al. (1995) may
be followed in order to preserve the non-recursive nature of lexicalist
transfer.



an interlingua or a transfer architecture and then incorporate solutions from the alternative
paradigm. We therefore maintain the distinction, at least for the purposes of this paper.

Our lexical signs broadly follow the signs of Pollard and Sag (1987)
although our work seems adaptable to the signs of Pollard and Sag (1994).
The implementation is based on the Typed Feature Structures (TFSs)
of the Acquilex LKB (Copestake et al. 1993) from which we borrow
our notation. Consider the (simplified) lexical entry for 'John':



[ proper-name
  ORTH   = John
  SYN    = [ syn
             AGR = 3sg ]
  QUALIA = [qualia]
  SEM    = john1(x) ]

In this TFS, features are written in small capitals, while types are
in bold face. To make TFSs easier to read, detail may be hidden by
'shrinking' a TFS; this is indicated with a box around the type of the
TFS (e.g. qualia above). TFSs of type qualia encode lexical semantic
information based on the Qualia structures of Pustejovsky (1991). For
the semantic representation of proper names we assume a predicate
treatment following the arguments of Devlin (1991:225). A bilexical
entry for 'John - Juan' would be:



[ proper-name                       [ proper-name
  ORTH   = John                       ORTH   = Juan
  SYN    = [ syn                      SYN    = [ syn
             AGR = 3sg ]     <->                 AGR = 3sg ]
  QUALIA = [qualia]                   QUALIA = [qualia]
  LANG   = english                    LANG   = spanish
  SEM    = john1(x) ]                 SEM    = juan1(x) ]



For reasons of space and convenience, we will abbreviate the above 
lexical sign and bilexical entry to 

john1_x

john1_x <-> juan1_x



respectively, where the subscripts correspond to the argument vari- 
able. It should be emphasised, however, that this abbreviated no- 
tation implicitly includes syntactic and semantic information which 
may be accessed during transfer or generation. 

To exemplify LMT, consider the translation of 'John loves Mary'.
Analysis results in a list (see footnote 2) of lexical signs the semantics
of which will contain shared variables:

john1_x love1_{e,x,y} mary1_y

The (tenseless) FOL formula corresponding to this expression is ∃e,x,y.
john1(x) & love1(e, x, y) & mary1(y), but since quantification and
scope will be ignored they will be omitted from our examples; fur- 
thermore, coordination will be assumed between predicates unless 
otherwise stated. 



2 We use lists of SL lexical items, instead of bags as is done in SB, to avoid certain inefficiencies
caused by the nature of lexicalist transfer (Garey and Johnson 1979:221).

Before transfer, a process similar to skolemization is applied to
the transfer representation in order to replace variables by constants.
The purpose of this operation is to prevent spurious bindings during
lexicalist generation, as will become clearer later. The result of analysis
is a list of lexical signs with translationally relevant relationships
expressed by shared constants (indicated by integers in our notation):

john1_1 love1_{2,1,3} mary1_3
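
The replacement of variables by constants can be illustrated with a short sketch. It is not the implementation described in this paper: the encoding of a lexical sign as a `(predicate, arguments)` tuple and the function name `skolemize` are assumptions made for the example, and constants are simply numbered in order of first appearance.

```python
def skolemize(signs):
    """Replace each distinct variable by an integer constant.

    `signs` is a list of (predicate, argument-tuple) pairs; this flat
    encoding is an assumption of the sketch, standing in for full TFSs.
    """
    constants = {}
    result = []
    for predicate, args in signs:
        # A variable seen before keeps its constant; a new one gets the next integer.
        new_args = tuple(constants.setdefault(v, len(constants) + 1) for v in args)
        result.append((predicate, new_args))
    return result


analysis = [("john1", ("x",)), ("love1", ("e", "x", "y")), ("mary1", ("y",))]
skolemized = skolemize(analysis)
```

Because shared variables map to the same integer, the argument relationships established by analysis survive as shared constants.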

The transfer step uses the source side of the bilexicon (possibly expanded
by bilingual lexical rules as described below) to derive a total
cover of the SL list (Garey and Johnson 1979:221) (a total cover is a
division of a set into a number of allowed subsets such that every element
in the set is a member of exactly one subset; we extend the term
here to apply it to lists). The bilexicon below enables construction of
an appropriate TL bag:

john1_x <-> juan1_x
mary1_x <-> maria1_x
love1_{x,y,z} <-> amar1_{x,y,z} a1_z

(Tense is omitted in this example; a simplistic model has been adopted 
in which an interlingua tense feature is passed from source to target 
verbs in the bilexicon.) Note that we include function words such 
as the Spanish case marker a in the bilingual lexicon (and therefore 
in the transfer representation). These words are treated as vacuous 
predicates (Calder et al. 1989) over the variable of the semantic head
on which they depend. For the present example, transfer results in 
the following TL bag: 

{ juan1_1 , amar1_{2,1,3} , a1_3 , maria1_3 }
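
The transfer step can be sketched as a search for total covers. The code below is an illustration under stated assumptions rather than the system's actual algorithm: signs are encoded as `(predicate, arguments)` tuples, bilexical entries as `(source_signs, target_signs)` pairs with string variables, and the names `unify`, `match_all` and `transfer` are hypothetical.

```python
def unify(pattern, sign, env):
    """Match one source pattern against one SL sign; return the extended bindings or None."""
    (p_pred, p_args), (s_pred, s_args) = pattern, sign
    if p_pred != s_pred or len(p_args) != len(s_args):
        return None
    env = dict(env)
    for var, const in zip(p_args, s_args):
        if env.setdefault(var, const) != const:
            return None
    return env


def match_all(patterns, signs, env):
    """Yield (bindings, leftover signs) for each way of matching every pattern to a distinct sign."""
    if not patterns:
        yield env, signs
        return
    for i, sign in enumerate(signs):
        new_env = unify(patterns[0], sign, env)
        if new_env is not None:
            yield from match_all(patterns[1:], signs[:i] + signs[i + 1:], new_env)


def transfer(signs, bilexicon):
    """Yield one TL bag (as a list) for every total cover of the SL list."""
    if not signs:
        yield []
        return
    head, tail = signs[0], signs[1:]
    for source, target in bilexicon:
        # The first uncovered sign must be covered by some pattern of the entry.
        for k, pattern in enumerate(source):
            env = unify(pattern, head, {})
            if env is None:
                continue
            others = source[:k] + source[k + 1:]
            for full_env, leftover in match_all(others, tail, env):
                tl = [(p, tuple(full_env.get(a, a) for a in args)) for p, args in target]
                for rest in transfer(leftover, bilexicon):
                    yield tl + rest


BILEXICON = [
    ([("john1", ("x",))], [("juan1", ("x",))]),
    ([("mary1", ("x",))], [("maria1", ("x",))]),
    ([("love1", ("x", "y", "z"))], [("amar1", ("x", "y", "z")), ("a1", ("z",))]),
]
SL = [("john1", (1,)), ("love1", (2, 1, 3)), ("mary1", (3,))]
tl_bags = list(transfer(SL, BILEXICON))
```

Since an entry's source side may contain several signs, the same search also accommodates entries that equate more than one source sign with a target bag.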

Lexicalist generation involves reordering the TL bag to construct a 
valid TL sentence. Since normally all permutations of the TL bag are 
attempted, the fact that variables are replaced by constants ensures 
that arguments not shared between predicates in the SL representa- 
tion are not shared in the TL representation either. This prevents 
Maria from being the subject of the sentence. The result of genera- 
tion, after morphological synthesis, is: 

Juan ama a Maria 



2.2 Other Properties of LMT 

LMT encourages two useful properties: modularity and reversibility. 
From an engineering point of view, modularity is desirable because it 
can reduce development and maintenance costs. By using sets of lexical
signs as their transfer representation, LMT systems can reduce the
difficulties posed by structural mismatches between two languages,
thus increasing the independence between source and target transfer
representations. For example, transfer systems adopting a recursive
representation for transfer (Kaplan et al. 1989), as opposed to a
non-recursive one (Copestake et al. 1995), may need additional mechanisms
for handling head switching (Kaplan and Wedekind 1993). By
contrast, under a lexicalist approach, head switching can be handled
purely compositionally with minimal assumptions (Whitelock 1992).

Reversibility is an important property in bi-directional systems as 
it reduces development costs. In LMT, grammars are fully reversible 
since they are used in similar ways for analysis and generation: the 
difference is that during lexicalist generation, ordering information is 
disregarded. However, the process is complete because the generator 
is guaranteed to generate all the strings accepted by the TL gram- 
mar which satisfy the constraints imposed by the TL bag. Lexicalist 
generation is also sound because only strings which satisfy the con- 
straints of the TL grammar are constructed. In addition, termination 
is guaranteed if it is guaranteed for parsing since one can at worst 
construct a generation algorithm which simply attempts all permu- 
tations of the TL bag and then parses them in order to test whether 
they are appropriate TL sentences. 
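
The worst-case algorithm mentioned above (try every permutation of the TL bag and test each one) can be sketched as follows. This is only an illustration: `toy_spanish_recognizer` is a hypothetical stand-in for parsing with the TL grammar and accepts just the pattern "subject verb a object", and signs are again assumed to be `(predicate, arguments)` tuples.

```python
from itertools import permutations


def toy_spanish_recognizer(seq):
    """Accept 'subject verb a object' when the shared constants agree.

    A hypothetical stand-in for parsing with the TL grammar.
    """
    if len(seq) != 4:
        return False
    subj, verb, marker, obj = seq
    if len(verb[1]) != 3:
        return False
    _event, s, o = verb[1]
    return (subj[0] != "a1" and subj[1] == (s,)
            and marker == ("a1", (o,))
            and obj[0] != "a1" and obj[1] == (o,))


def generate(bag, accepts):
    """Return every ordering of the bag that the recognizer accepts."""
    return [list(order) for order in permutations(bag) if accepts(order)]


tl_bag = [("maria1", (3,)), ("a1", (3,)), ("amar1", (2, 1, 3)), ("juan1", (1,))]
sentences = generate(tl_bag, toy_spanish_recognizer)
```

In this toy setting the constants rule out the ordering in which Maria is the subject, because only the sign carrying constant 1 can fill the subject position of the verb.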



3 Multi-Lexical Translations 

One of the reasons for transfer modules being expensive to construct
is the presence of complex transfer relations (Arnold and Sadler 1992;
Hutchins and Somers 1992). One type of phenomenon that leads to
complex transfer in a number of systems may be called multi-lexical
translation. These are translations in which a phrase cannot easily
be translated through the translation of its parts. The translation
of idioms is an extreme case of this. For example, 'kick the bucket'
translates as estirar la pata (Lit. 'to stretch a leg') in Spanish, even
though there is no simple correspondence between the components of
each phrase (all translations in this paper are between English and
Spanish unless otherwise stated). For such constructions, structures
corresponding to the source and target phrases need to be equated
either in the transfer module (Schenk 1986) or in separate dictionaries
(Sadler et al. 1990) in many systems. Other phenomena which
may be loosely labelled multi-lexeme translations include: lexical gaps
such as 'piece of advice' - consejo (Soler and Marti 1993); support
verb and category differences such as 'to be thirsty' - tener sed (to
have thirst) (Danlos and Samvelian 1992); lexicalization patterns like
'swim across the river Dee' - cruzar el río Dee nadando (Talmy 1985);
conflational divergences as in 'to stab someone' - darle puñaladas a
alguien (Dorr 1992).



Phenomena such as idioms, lexical gaps and conflational diver- 
gences can be tackled in LMT by equating sets of source and target 
lexical signs: 

a) kick1_{e,s,o} , the1_o , bucket1_o <-> estirar1_{e,s,o} , la1_o , pata1_o

b) piece1_x , of1_{x,y} , advice1_y <-> consejo1_x

c) stab1_{e,s,o} <-> dar1_{e,s,p,o} , le1_o , puñalada1_p , a1_o

(We include lexical signs for determiners, clitics and accusative mark- 
ers as predicates over the variable of their syntactic head; however, 
reasoning formalisms may dispense with them.) Note that we choose 
the variable of 'piece' on the English side as the argument variable 
on the Spanish side; if phrases such as 'a piece of good advice' are 
allowed, the Spanish side would be consejo1_{x⊔y}, whose semantic argument
would be unifiable with both x and y to permit modifiers and
heads to combine appropriately during generation. 

To translate 'John kicked the bucket', the SL transfer representa- 
tion: 

john1_1 kick1_{2,1,3} the1_3 bucket1_3

is covered by the bilexicon. The result is the union of the target side
of all the bilexical entries used in this process:

{ juan1_1 } ∪ { estirar1_{2,1,3} , la1_3 , pata1_3 }

(We ignore the literal translation of the idiom.) Generation then 
proceeds via the Spanish grammar and bag generator. 

In the case of the other multi-lexeme translations mentioned the 
difficulties posed by varying lexical elements in part or all of the trans- 
lation relation cannot be easily handled in the original SB formulation. 
Consider for example the case of 'John is thirsty'; its Spanish transla- 
tion, Juan tiene sed (lit. 'John has thirst') differs from it in two main 
ways: the English adjective translates into a Spanish noun, while the 
verb is not intuitively felt to be the translation of tener. The problem
for LMT based on one-to-one transfer is that a literal translation
into Spanish is incorrect (*Juan está sediento), and that even if TL
filtering (Alshawi et al. 1992) were used to eliminate such a sentence,
the efficiency of the system would be compromised and translation 
of unseen sentences would be more error prone. Alternatively, an 
idiom-based translation in which the bilexicon relates 'be thirsty' and 
tener sed ignores important systematic differences between the two 
languages:

John is thirsty    Juan tiene sed
John is hungry     Juan tiene hambre
John is lucky      Juan tiene suerte
John is angry      Juan tiene rabia
John is hot        Juan tiene calor
John is cold       Juan tiene frío

We therefore argue that a one-to-one translation for such phrases 
is not adequate but instead consider the highlighted phrases above as 
the correct equivalences between the two languages. The task then, 
is to find a mechanism for efficiently capturing regularities of this sort 
in the present framework. There are a number of alternatives for 
achieving this. We will consider three. 

3.1 Lexical Neutralization 

The first possibility for handling multi-lexeme regularities in LMT is 
to eliminate support verbs from the SL transfer representation alto- 
gether, and to reintroduce them during generation. In this case, a 
semantic representation for the sentences must be proposed. For the 
sake of argument assume an adjective-like intersective semantics for 
both the Spanish nouns Juan and sed and the corresponding English 
noun and adjective: 

SL: john1_1 thirsty1_1
TL: juan1_1 sed1_1

Then, the bilexicon would include, among other things:

thirsty1_x <-> sed1_x
hungry1_x <-> hambre1_x
etc. 

Lexicalist transfer would apply these equivalences to construct an 
appropriate TL bag. During Spanish bag generation, the appropriate
support verb (i.e. tener) would be introduced by inspection
of monolingual lexical information associated with sed (Danlos and
Samvelian 1992), from which correct instantiation of the orthography
of the TL sentence would ensue. A variation of this strategy would 
be to use a partially instantiated lexical sign corresponding to the 
English support verb: 

{ john1_1 , support-verb_{2,1,3} , thirsty1_3 }

During transfer, the support verb is translated as a partially instantiated
support verb in Spanish. The generation algorithm would then
be applied such that monolingual constraints in the Spanish grammar
fully instantiate the semantics and orthography of this verb according
to the support verb requirements of its complement noun.

3.2 Lexical Variables



The second mechanism for capturing multi-lexeme regularities assumes
translation variables similar to those used in several transfer
systems (Alshawi et al. 1992; Bech et al. 1991; Russell et al. 1991).
If one represents transfer variables by tr(<restrictions>), then the
necessary bilexical entry would be:

be1_{x,y,z} tr(Adj_z) <-> tener1_{x,y,z} tr(Noun_z)

This entry states that 'be' translates as tener as long as its complement
adjective translates as the complement noun of tener. The
transfer algorithm is modified to accommodate the transfer variable
by, for example, recursively calling itself on the value of tr(Adj_z).
Generation, however, proceeds as before. A variation of this mechanism
is to use contextual rather than transfer variables. In this case,
a particular lexical context is specified which constrains translation
equivalence in a manner analogous to the way left and right contexts
are used in morphological rewriting rules (Kaplan and Kay 1994).
Thus, the transfer relation

be1_{x,y,z} , (Adj_z) <-> tener1_{x,y,z} , (Noun_z)

would indicate that in the context of an adjective complement, 'be'
may translate as tener or vice versa. The main difference between this
and the transfer variable variant is that the contextual elements, Adj
and Noun, can serve as context to multiple transfer relations within
the same cover, whereas this would not be possible with transfer variables.
We will appeal to contextual variables in Section 5.

The third mechanism uses bilingual lexical rules to map bilexical 
entries into new bilexical entries. We have adopted this mechanism 
for certain multi-lexeme translations because it allows the exploita- 
tion of monolingual lexical rules in a motivated manner which inte- 
grates naturally with the LMT architecture, and because it provides 
a framework in which to study differences between lexical processes 
in different languages. 



4 Lexical and Bi-Lexical Rules 



The lexicon has taken a prominent place in several linguistic theories
(Pollard and Sag 1994; Oehrle et al. 1988), not least because,
given appropriate tools, both general and idiosyncratic properties
of language can be captured within a uniform framework. Among
the tools normally employed one finds lexical rules (Dowty 1978;
Flickinger 1987; Pollard and Sag 1994) and inheritance mechanisms
(Briscoe et al. 1993; Flickinger and Nerbonne 1992). Lexical rules may
be thought of as establishing a relationship between lexical items such
that given the presence of one lexical item in the lexicon the existence 
of a further item may be inferred. The regularities captured by lexi- 
cal rules might include changes in the subcategorization and control 
properties of a verb, the denotation of a noun or the interpretation 
of a preposition. With the advent of lexically oriented approaches to 
translation, it is worth considering whether and how the generaliza- 
tions captured by lexical rules might be exploited in MT. 

In order to investigate this issue we have adopted the notion of a 
bi-lexical rule. A bi-lexical rule (Trujillo 1992; Copestake et al. 1993)
takes a bilexical entry as input, and outputs a new bilexical entry. 
These rules may be seen as expanding the bilexicon in order to increase 
its coverage; under this view, they are somewhat analogous to lexical 
rules in that they reduce the number of bilexical entries that need 
to be explicitly listed. Bi-lexical rules also serve to capture lexical, 
syntactic and semantic regularities in the translation between two 
languages by relating equivalent lexical processes cross-linguistically. 



4.1 Simple Bi-lexical Rule 



We give a simple example of a bi-lexical rule before addressing the 
multi-lexeme translations introduced earlier. Consider the relationship
that exists in English-Spanish translations between the translation
of fruits and the translation of their corresponding trees (Soler
and Marti 1993):



Fruit                        Tree
English    Spanish           English        Spanish
almond     almendra          almond tree    almendro
apple      manzana           apple tree     manzano
cherry     cereza            cherry tree    cerezo
orange     naranja           orange tree    naranjo
plum       ciruela           plum tree      ciruelo
lemon      limón             lemon tree     limonero



The relevant relationship may be described by the following bi-lexical
rule:

[ common-noun                      [ common-noun
  ORTH   = orth                      ORTH   = orth
  SYN    = [syn]                     SYN    = [syn]
  QUALIA = fruit1(x)        <->      QUALIA = fruit1(x)
  LANG   = english                   LANG   = spanish
  SEM    = pred(x) ]                 SEM    = pred(x) ]

        ⇓ noun-noun                        ⇓ fruit-tree

[ common-noun            [ common-noun               [ common-noun
  ORTH   = orth            ORTH   = tree               ORTH   = orth + MORPH
  SYN    = [syn]           SYN    = [syn]              SYN    = [syn]
  QUALIA = fruit1(y)       QUALIA = tree1(z)    <->    QUALIA = tree1(z)
  LANG   = english         LANG   = english            LANG   = spanish
  SEM    = pred(y,z) ]     SEM    = tree1(z) ]         SEM    = pred(z) ]

This bi-lexical rule says that if there is a bilexical entry translating 
English fruit nouns into Spanish fruit nouns, then there is a bilexical 
entry translating 'noun tree' in English into a morphologically derived
tree-denoting noun in Spanish. 

We adopt Qualia structure (Pustejovsky 1991) as our lexical-semantic
representation formalism. According to Pustejovsky, Qualia structure
is one of the four main types of information to be associated with a
lexical entry (the others being Argument, Event and Inheritance
structure). The information incorporated in a Qualia structure specifies
the semantics of a lexical item by virtue of the relations and properties
in which it participates. For this example we assume a simplified
Qualia value (Pustejovsky 1991) indicating whether a noun denotes
a tree or a fruit. Note that the morphology of the output Spanish
lexical sign is left implicit since it depends on the actual noun used
(see fruit-tree table above); in addition, the English rule mapping a
noun into a noun modifier is a practical simplification of the complex
issue of noun-noun modification which we do not address here
(Pustejovsky and Boguraev 1993; Johnston et al. 1994). Another point to
note is that we will be vague regarding the amount of information
shared between the input and output lexical signs of lexical rules;
a full treatment of this issue involves aspects of default unification
which are beyond the scope of this paper (Meurers 1994; Lascarides
et al. in press). Suffice it to say that in our implementation, an
attempt has been made to share maximum information between input
and output lexical signs, although values such as semantic variables
are not shared between input and output lexical signs.

In the abbreviated notation introduced earlier, the above bi-lexical
rule will be represented as:

N_x <-> N_x
⇓ identity    ⇓ fruit-tree
N_{y,z} tree1_z <-> N_z

Given the translation 'apple - manzana' for example, the rule would
operate as indicated below:

apple1_x <-> manzana1_x
⇓ identity    ⇓ fruit-tree
apple1_{y,z} tree1_z <-> manzano1_z

Its output is the additional translation relation 'apple tree - manzano' . 
Similar translations are achieved for other fruits. 

Clearly this rule should only apply to fruits which grow on trees 
and not to fruits such as strawberries which are found on low growing 
plants. Such restrictions need to be incorporated in the monolingual 
lexical signs and rules. 

Implementationally, bilexical rules may be applied off-line in order 
to expand the bilexicon before processing, or they may be applied dur- 
ing transfer to extend the bilexicon just sufficiently to enable transfer. 
We have opted for the latter approach. 
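
As a rough illustration of applying the fruit-tree rule only when the input calls for it, consider the sketch below. It makes several assumptions that are not part of the paper: entries are plain dictionaries, the Qualia restriction is reduced to a `"qualia"` label, the Spanish morphology (which the text leaves implicit) is replaced by the lookup table `TREE_NOUN`, and the function names are hypothetical.

```python
# Hypothetical stand-in for the Spanish morphological derivation; the forms
# are those listed in the fruit/tree table.
TREE_NOUN = {
    "almendra": "almendro", "manzana": "manzano", "cereza": "cerezo",
    "naranja": "naranjo", "ciruela": "ciruelo", "limón": "limonero",
}

BASE_ENTRIES = [
    {"en": "apple", "es": "manzana", "qualia": "fruit"},
    {"en": "lemon", "es": "limón", "qualia": "fruit"},
    {"en": "strawberry", "es": "fresa", "qualia": "fruit"},  # no tree form listed
]


def fruit_tree_rule(entry):
    """Derive 'N tree <-> tree noun' from 'N <-> fruit noun', or None if inapplicable."""
    if entry["qualia"] != "fruit" or entry["es"] not in TREE_NOUN:
        return None
    return {
        "source": [(entry["en"] + "1", ("y", "z")), ("tree1", ("z",))],
        "target": [(TREE_NOUN[entry["es"]] + "1", ("z",))],
        "qualia": "tree",
    }


def expand_for(sl_predicates, base_entries):
    """Apply the rule during transfer, only for nouns that co-occur with 'tree1'."""
    derived = []
    if "tree1" in sl_predicates:
        for entry in base_entries:
            if entry["en"] + "1" in sl_predicates:
                new_entry = fruit_tree_rule(entry)
                if new_entry is not None:
                    derived.append(new_entry)
    return derived


derived_entries = expand_for({"apple1", "tree1"}, BASE_ENTRIES)
```

The missing table entry for 'fresa' plays the role of the restriction that keeps the rule from applying to fruits that do not grow on trees; a fuller treatment would state this in the monolingual lexical signs.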



4.2 Support Verbs 

We now show how bi-lexical rules can be used in the translation of
'thirsty', basing our analysis on the classification of support verbs
proposed by Danlos and Samvelian (1992) for English-French translation.
Their proposal, implemented as part of a Eurotra project,
involves transfer at the Interface Structure. The essence of their
approach is similar to that for multi-lexeme translations given in Section
3.1: the support verb is deleted from the SL transfer structure, the
adjective 'thirsty' is translated into the TL noun (sed in our case), and
an appropriate TL support verb is incorporated into the TL sentence
during generation. Information regarding which support verb a noun
requires is encoded in its lexical entry.

Support verbs can be of five types: neutral (e.g. 'is thirsty'), durative
(e.g. 'remain thirsty'), inchoative (e.g. 'get thirsty'), terminative
(e.g. 'stop being thirsty') and iterative (e.g. 'be thirsty again'). We
will consider neutral support verbs only although the other categories
could also be handled through bi-lexical rules. One difference between
the present approach and that of Danlos and Samvelian is that we equate
the noun 'thirst' with the noun sed in the bilexicon, rather than equating
an adjective and a noun, thus factoring category and support verb
differences:

thirst1_x <-> sed1_x

We believe this reflects more truly the translation relation that exists
between the two lexical items. An English-Spanish bi-lexical rule
is then introduced to derive the adjective on the English side and to
include the neutral support verb 'be'; on the Spanish side the support
verb tener, for the noun sed, is introduced:

N_x <-> N[ntrl=tener]_x
⇓ adjective    ⇓ identity
be1_{e,x,y} A[ntrl=be]_y <-> tener1_{e,x,y} N[ntrl=tener]_y

Note that we underspecify the support verb for the input English 
noun to allow 'John has an unquenchable thirst' and similar examples. 
The neutral (ntrl) control verb required by the English adjective is 
included in its lexical entry's Qualia structure. Thus, a fuller TFS for 
'thirsty' is: 

[ adjective
  ORTH   = thirsty
  SYN    = [syn]
  QUALIA = [ qualia
             SUPP-VERBS = [ supp-verbs
                            NTRL = be(e,s,y)
                            INCH = get(e,s,y) ] ]
  LANG   = english
  SEM    = thirsty1(y) ]



In designing an appropriate Qualia structure we have added to the
roles proposed by Pustejovsky (1991) (Constitutive, Formal, Telic and
Agentive) in order to incorporate information necessary for capturing
particular phenomena (Johnston et al. 1994).

When translating 'John is thirsty', the analyser constructs the 
transfer representation: 

john1_1 be1_{2,1,3} thirsty1_3

We include the support verb 'be' in our representation, even though 
it has empty semantics, in order to encode scoping information - i.e. 
to prevent 'John is a painter' translating as 'a painter is John'; this 
rather ad hoc solution could be replaced by a mechanism analogous 
to the labels used in Underspecified Discourse Representation Theory 
(Reyle 1995; Frank and Reyle 1995).

During transfer, the bi-lexical rule above is applied to the bi-lexical
entry for 'thirst' to yield:

be1_{e,s,x} , thirsty1_x <-> tener1_{e,s,x} , sed1_x

This multi-lexeme relation is used to translate 'is thirsty' into tiene 
sed; a separate entry translates 'John' into Juan. Bag generation 
then ensures that the TL bag yields a sentence which satisfies the 
constraints specified by the TL grammar. 

The intuitive description of the above process is that we consider 'is 
thirsty' not to be translatable compositionally, but instead to require 
a multi-lexeme translation. The purpose of bi-lexical rules then is to 
minimize the repetition of information in the bi-lexicon while allowing 
the exploitation of monolingual lexical processes. 
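
The derivation of a support-verb entry from a noun-noun entry can be sketched as below. The lookup tables stand in for monolingual lexical information (the English noun-to-adjective rule and the neutral support verbs stored in Qualia structure); their names and contents, like the function `support_verb_rule`, are assumptions of this sketch and not part of the implementation described here.

```python
# Hypothetical monolingual information, standing in for lexical rules and
# for the SUPP-VERBS values held in Qualia structure.
ADJECTIVE_OF = {"thirst": "thirsty", "hunger": "hungry"}   # English noun -> adjective
NTRL_EN = {"thirsty": "be", "hungry": "be"}                # neutral support verb of the adjective
NTRL_ES = {"sed": "tener", "hambre": "tener"}              # neutral support verb of the Spanish noun


def support_verb_rule(en_noun, es_noun):
    """From 'N <-> N' derive 'be A <-> tener N'; return None if the rule does not apply."""
    adjective = ADJECTIVE_OF.get(en_noun)
    if adjective is None or adjective not in NTRL_EN or es_noun not in NTRL_ES:
        return None
    source = [(NTRL_EN[adjective] + "1", ("e", "x", "y")), (adjective + "1", ("y",))]
    target = [(NTRL_ES[es_noun] + "1", ("e", "x", "y")), (es_noun + "1", ("y",))]
    return source, target


derived = support_verb_rule("thirst", "sed")
```

Only the pair 'thirst - sed' has to be listed; the multi-lexeme relation is computed from it, which is the sense in which repetition in the bilexicon is reduced.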
4.3 Lexicalization Patterns

There are other translation phenomena which can be described through
the use of bi-lexical rules. Consider lexicalization patterns for example
(Talmy 1985):

John swims across the river.
Juan cruza el río nadando.

In the English sentence, the main verb encodes manner (i.e. swim- 
ming) and motion, while in Spanish it encodes path (i.e. across) and 
motion; the remaining meaning component in each case is expressed 
through a modifier. Talmy attributes these distinctions to differences 
in lexicalization patterns between the two languages. 

A previous approach to such translations has been to introduce
the bilexical entries 'swim - nadar + ando' and 'across - cruzar'
(Beaven 1992). This approach, however, only implicitly acknowledges
that these two translations are only appropriate in conjunction, and
that separately they are in fact unintuitive. This not only increases
the non-determinism of transfer and generation, but can increase the
likelihood of incorrect translations for unseen sentences. In the
bi-lexical rule view, one relates verb translations to translations
incorporating lexicalization patterns as follows:

V_{e,s} <-> V_{e,s}
⇓ identity    ⇓ gerund
V_{f,t} across1_{f,u} <-> cruzar1_{f,t,u} V[vform=ing]_{f,t}

This rule derives, for every (movement) verb translation, a multi-lexeme
translation which includes 'across' as a modifier (we leave the
restriction on verbs to movement events implicit; also, a simplified
description of 'across' is assumed (Trujillo forthcoming)).



Application of this rule to 'swim - nadar' may be depicted as 
follows: 

swim1_{e,s} <-> nadar1_{e,s}
⇓ identity    ⇓ gerund
swim1_{f,t} across1_{f,u} <-> cruzar1_{f,t,u} nadando1_{f,t}

Lexicalist translation of 'John swims across the river' can then proceed 
by translating 'swims across' with the output of this rule and the 
remaining elements of the input via other bilexical entries. 
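
A minimal sketch of this rule follows. It rests on assumptions that should be read as such: the argument pattern (event f, mover t, traversed object u) is one reading of the rule, `regular_gerund` covers only regular Spanish gerund formation, and both function names are hypothetical.

```python
def regular_gerund(infinitive):
    """Regular Spanish gerund: -ar -> -ando, -er/-ir -> -iendo (irregular verbs are not handled)."""
    stem, ending = infinitive[:-2], infinitive[-2:]
    if ending == "ar":
        return stem + "ando"
    if ending in ("er", "ir"):
        return stem + "iendo"
    raise ValueError("not a Spanish infinitive: " + infinitive)


def across_rule(en_verb, es_verb):
    """From 'V <-> V' derive 'V across <-> cruzar V-gerund'.

    The restriction to movement verbs is left implicit, as in the text.
    """
    source = [(en_verb + "1", ("f", "t")), ("across1", ("f", "u"))]
    target = [("cruzar1", ("f", "t", "u")), (regular_gerund(es_verb) + "1", ("f", "t"))]
    return source, target


swim_across = across_rule("swim", "nadar")
```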

4.4 Head Switching 

The phenomenon of head switching in translation can be exemplified 
by the following pair of sentences: 

John just arrived.
Juan acaba de llegar.

The problem with such translations is that the syntactic head in the
SL sentence is not the syntactic head in its translation. This is a major
obstacle for syntactic and even some semantic based translation
systems because of the recursive nature of their transfer representations.

Head switching has been given a number of solutions in a variety
of systems (Kaplan et al. 1989; Sadler and Thompson 1991; Russell
et al. 1991; Whitelock 1992; Kaplan and Wedekind 1993). In our
framework, the solution is expressed by the following rule (see footnote 3):

V_{e,s} <-> V_{e,s}
⇓ identity    ⇓ infinitive
just1_f V_{f,t} <-> acabar_de1_{f,t,f} V_{f,t}

Application to the bilexical entry 'arrive - llegar' results in:

just1_f , arrive1_{f,t} <-> acabar_de1_{f,t,f} , llegar1_{f,t}

Lexicalist translation progresses as before. To exemplify the use of bi- 
lexical rules in head switching, we consider translation in embedded 
contexts in more detail now. To translate between: 

Mary thinks John just arrived. 

Maria piensa que Juan acaba de llegar. 

the parser constructs the following representation (again, ignoring 
issues of scope and quantification): 

maryli , think^i^ , johnl3 , justly , arrivel4 j 3 

Assuming appropriate transfer of 'Mary' and 'John', translation of 
the embedded clause obtains as follows. 'Thinks' is translated by the 
following entry: 

think1_{e,s,c}  <->  pensar_que1_{e,s,c}

In addition, the output of the previous bi-lexical rule serves for
multi-lexeme transfer of 'just arrive' to give the incomplete bag:

{ pensar_que1_{2,1,4} , acabar_de1_{4,3,4} , llegar1_{4,3} }

The final result of transfer is the TL bag: 

{ maria1_{1} , pensar_que1_{2,1,4} , juan1_{3} , acabar_de1_{4,3,4} , llegar1_{4,3} }
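
The transfer step just described can be sketched as a small Python
program. It is only an illustration under stated assumptions: signs are
written as (lexeme, indices) pairs without sense numbers, the helper
names unify, match_entry and transfer are hypothetical, and entries are
tried greedily in the order listed (multi-lexeme entries first), which
is a simplification of lexicalist transfer.

```python
def unify(pattern, sign, bindings):
    """Bind the pattern's index variables to the sign's constants."""
    (p_lex, p_idx), (s_lex, s_idx) = pattern, sign
    if p_lex != s_lex or len(p_idx) != len(s_idx):
        return None
    new = dict(bindings)
    for var, val in zip(p_idx, s_idx):
        if new.setdefault(var, val) != val:
            return None  # variable already bound to another constant
    return new


def match_entry(patterns, bag, bindings=None, used=()):
    """Return (bindings, used positions) covering all patterns, or None."""
    bindings = {} if bindings is None else bindings
    if not patterns:
        return bindings, used
    for pos, sign in enumerate(bag):
        if pos in used:
            continue
        new = unify(patterns[0], sign, bindings)
        if new is not None:
            result = match_entry(patterns[1:], bag, new, used + (pos,))
            if result is not None:
                return result
    return None


def transfer(bag, entries):
    """Translate an SL bag into a TL bag, consuming each SL sign once."""
    remaining, target = list(bag), []
    for src, tgt in entries:
        found = match_entry(src, remaining)
        while found is not None:
            bindings, used = found
            target += [(lex, tuple(bindings[v] for v in idx))
                       for lex, idx in tgt]
            remaining = [s for i, s in enumerate(remaining) if i not in used]
            found = match_entry(src, remaining)
    if remaining:
        raise ValueError(f"untranslated signs: {remaining}")
    return target


ENTRIES = [
    # Output of the head switching rule applied to 'arrive - llegar'.
    ([("just", ("e",)), ("arrive", ("e", "s"))],
     [("acabar_de", ("e", "s", "e")), ("llegar", ("e", "s"))]),
    ([("think", ("e", "s", "c"))], [("pensar_que", ("e", "s", "c"))]),
    ([("mary", ("x",))], [("maria", ("x",))]),
    ([("john", ("x",))], [("juan", ("x",))]),
]

SL_BAG = [("mary", (1,)), ("think", (2, 1, 4)), ("john", (3,)),
          ("just", (4,)), ("arrive", (4, 3))]

tl_bag = transfer(SL_BAG, ENTRIES)
```

Since the result is a bag, the order of the elements in tl_bag carries
no meaning; ordering is left to generation.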

During generation, acabar_de is made the syntactic head of the
sentence through grammatical constraints in the Spanish grammar.
Illustrative rules might be:



S_{e}     ->  NP_{s} VP_{e,s}
VP_{e,s}  ->  Vvp_{e,s,c} VP_{c,s}



3 We ignore the (complex) issue of tense for this type of example of head switching; we expect 
that it can be tackled independently of the present approach. 

If pensar_que has category Vs, and acabar_de has category Vvp, there
is only one ordering of the TL bag by which the constraints indicated
by this small grammar can be satisfied, namely, the order given by its
translation:

Maria piensa_que Juan acaba_de llegar. 
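
The claim that only one ordering satisfies the constraints can be
checked with a brute-force sketch of bag generation in Python. The
four-rule grammar encoded below (S -> NP VP, VP -> Vs S, VP -> Vvp VP,
VP -> V) and the category assigned to llegar are assumptions made for
the example, as are the function names; practical bag generation
algorithms avoid enumerating all permutations.

```python
from itertools import permutations

# Hypothetical category assignments for the TL lexemes.
CATEGORY = {"maria": "NP", "juan": "NP", "pensar_que": "Vs",
            "acabar_de": "Vvp", "llegar": "V"}


def parse_s(words, i):
    """Yield (event, end) for each S_e -> NP_s VP_{e,s} starting at i."""
    if i < len(words) and CATEGORY[words[i][0]] == "NP":
        (subject,) = words[i][1]
        for event, vp_subject, end in parse_vp(words, i + 1):
            if vp_subject == subject:
                yield event, end


def parse_vp(words, i):
    """Yield (event, subject, end) for each VP starting at position i."""
    if i >= len(words):
        return
    lexeme, indices = words[i]
    category = CATEGORY[lexeme]
    if category == "V":        # VP_{e,s} -> V_{e,s}
        event, subject = indices
        yield event, subject, i + 1
    elif category == "Vs":     # VP_{e,s} -> Vs_{e,s,c} S_c
        event, subject, comp = indices
        for s_event, end in parse_s(words, i + 1):
            if s_event == comp:
                yield event, subject, end
    elif category == "Vvp":    # VP_{e,s} -> Vvp_{e,s,c} VP_{c,s}
        event, subject, comp = indices
        for vp_event, vp_subject, end in parse_vp(words, i + 1):
            if vp_event == comp and vp_subject == subject:
                yield event, subject, end


def generate(bag):
    """Return every ordering of the bag that parses as a complete S."""
    return [" ".join(lexeme for lexeme, _ in order)
            for order in permutations(bag)
            if any(end == len(order) for _, end in parse_s(order, 0))]


TL_BAG = [("maria", (1,)), ("pensar_que", (2, 1, 4)), ("juan", (3,)),
          ("acabar_de", (4, 3, 4)), ("llegar", (4, 3))]

sentences = generate(TL_BAG)
# sentences == ["maria pensar_que juan acabar_de llegar"]
```

The index checks do the work here: maria cannot head the embedded
clause because its index differs from the subject index of acabar_de
and llegar, and acabar_de must precede llegar because only Vvp takes a
VP complement.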

It may be noticed that head selection by the TL grammar is possible
because the event semantic constants in acabar_de and llegar are the
same. A consequence of this is that modifiers which apply to
'just arrived' and 'arrived' separately will be indistinguishable
during TL generation. Avoiding this problem entails transferring
scoping domains for modifiers in order to constrain generation.
However, we have no readily implementable mechanism for achieving
this in LMT as yet.

This concludes our overview of the different translation mismatches 
that may be handled through bi-lexical rules. We now consider some 
unresolved issues arising from their use. 

5 Bi-lexical Rule Interaction 

One difficulty we have found with bilexical rules has been their
composition. For example, consider the following translation:

1) John marched the soldiers across the valley. 

1') Juan le hizo cruzar el valle a los soldados marchando. 

In our framework, two bi-lexical rules should be applied in such cases.
One constructs causative translations (Comrie 1985; Levin and
Rappaport 1995):

march1            <->  marchar1

⇓ causative            ⇓ infinitive

march1_causative  <->  hacer1 marchar1_infinitive

The other deals with differences in lexicalization patterns such as
'march across - cruzar marchando'. The problem is that in isolation
neither of these rules could perform the above translation. Ideally
one should be able to use the output of one as input to the other
to derive 'march across - hacer cruzar marchando', but this is not
possible because both bi-lexical rules expect a mono-lexeme bilexical
entry.

One possible solution is to manually add further bi-lexical rules
which incorporate the composition of other rules:

V             <->  V'

⇓ causative        ⇓ gerund

V across1     <->  hacer1 cruzar1 V'

However, this solution leads to a combinatorial explosion in the
number of bi-lexical rules.

The line of work we are investigating combines bi-lexical rules with 
the context variables given in Section |37|. There remain problems in 
our implementation, however, which will be evident from the following 
description. In our proposed approach either the causative or the 
lexicalization pattern bi-lexical rule, or both, incorporate a context 
variable in their output bilexical entry. For example, assume that the 
variable is included in the causative rule: 

V            <->  V'

⇓ causative       ⇓ infinitive

(V)          <->  hacer1 (V')

This rule says that whenever there is a verb bilexical entry, there is
also an entry which in the context of a causative verb introduces hacer
in the TL bag. Applying the rule to 'march - marchar' gives:

march1                 <->  marchar1

⇓ causative                 ⇓ infinitive

2) (march1_causative)  <->  hacer1 (marchar1_infinitive)

Lexicalist transfer of 'march1_causative ... across' via the output of
this rule and that for lexicalization patterns proceeds as follows: the
causative reading of 'march' unifies with the context lexical sign in 2)
but is not translated by it. The TL side therefore only contributes
hacer to the final TL bag. Via the bi-lexical rule given in Section
4.3, 'march across' is transferred such that cruzar and marchando
form part of the final TL bag. The result is therefore hacer cruzar
marchando, which, in combination with the translation of the rest of
the sentence, can form the basis for bag generation.
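
The role of the context sign can be sketched in Python as follows. The
dictionary layout with 'context', 'translates' and 'target' keys, the
feature names and the helper functions are assumptions made for this
illustration; indices and the TL context sign are left out for brevity.

```python
def find(pattern, bag):
    """Return the position of the first sign carrying all pattern features."""
    for pos, sign in enumerate(bag):
        if all(sign.get(key) == value for key, value in pattern.items()):
            return pos
    return None


def transfer(bag, entries):
    """Context signs must be present but are not consumed by the entry."""
    consumed, target = [], []
    for entry in entries:
        context = [find(p, bag) for p in entry["context"]]
        hits = [find(p, bag) for p in entry["translates"]]
        if None in context or None in hits:
            continue  # entry does not apply to this bag
        consumed += hits
        target += entry["target"]
    if sorted(consumed) != list(range(len(bag))):
        raise ValueError("each SL sign must be translated exactly once")
    return target


ENTRIES = [
    # Output 2) of the causative rule: 'march' is context only.
    {"context": [{"lex": "march", "reading": "causative"}],
     "translates": [],
     "target": ["hacer"]},
    # Output of the lexicalization pattern rule applied to 'march - marchar'.
    {"context": [],
     "translates": [{"lex": "march"}, {"lex": "across"}],
     "target": ["cruzar", "marchando"]},
]

SL_BAG = [{"lex": "march", "reading": "causative"}, {"lex": "across"}]

tl_bag = transfer(SL_BAG, ENTRIES)
# tl_bag == ["hacer", "cruzar", "marchando"]
```

The causative entry contributes only hacer, while 'march' itself is
consumed once, by the lexicalization pattern entry. Without the
causative reading in the SL bag the first entry would not apply and
only cruzar and marchando would be produced.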

Our main problem is that of resolving conflicts between the syntactic
constraints imposed by each bi-lexical rule. The causative rule
requires the Spanish side to include an infinitive verb, while the
lexicalization pattern rule requires a gerundive verb. Clearly both
constraints cannot be satisfied for the same lexical sign marchar1.
The problem shows up in our proposal in that the rule which includes
the contextual pattern must be chosen carefully. If the lexicalization
pattern rule rather than the causative rule had included the
contextual verb lexical sign, the gerundive marchando could not have
been generated. Instead, a sentence analogous to 'John made the
soldiers march crossing the valley' would result, which is perhaps not
desirable. In other words, the conflict between gerundive and
infinitive morphology for 'march' is decided manually in advance. The
interaction of such decisions with other bi-lexical rules might
therefore be unpredictable, and is left for further investigation.

6 Implementation



The implemented prototype system contains approximately 250 bilexical
entries; this figure includes 20 proper names, 20 multi-lexeme
translations and 6 contextual rules. The following translations were
done on a SUN Sparc workstation using Allegro Common Lisp. The
time taken to find all possible TL sentences is given in seconds; total
times are for CPU + typical garbage collection times.



Translation                              Total (CPU)

John thinks Mary just arrived
Juan piensa_que Maria acaba_de llegar    50 (28)

John swam across the river
Juan cruzo el rio nadando                19 (16)

John marched the soldiers
Juan hizo marchar a los soldados         19 (17)



These timings are only intended to give some idea of the type and 
stage of our implementation, rather than reflect the performance of 
an optimized system. 



7 Conclusion 

We have introduced the mechanism of bi-lexical rules for incorporating
lexical rules in MT. These rules establish correspondences between
bilexical entries such that given the presence of one entry, the
existence of another bilexical entry can be inferred. We presented
various phenomena that can be described using such rules: noun sense
extensions, support verbs, lexicalization patterns and head switching.
The rules provide a useful and motivated extension to the LMT paradigm
by providing it with a uniform approach to the description of a number
of translation phenomena.

The problems arising from conflicting constraints imposed by
different translation relations were described, and a partial solution
to these was offered involving the combined use of bi-lexical rules
and contextual variables.

Future work could consider implementing Mel'cuk's lexical functions
(Heylen et al. 1994) in a manner similar to the way bi-lexical rules
were used in the translation of support verbs.